April 1999
XML for the absolute beginner

A guided tour from HTML to processing XML with Java



XML and Java
Up to this point I've been laying out general information about XML, without much reference to Java. Now that you understand XML, it's time to look at how to process XML in Java. Java's a great language for XML, as you'll see: XML provides a portable data format that nicely complements Java's portable code.

SAX appeal
The easiest way to process an XML file in Java is by using the Simple API for XML, or SAX. SAX is a simple Java interface that many Java XML parsers implement. A SAX parser is a class that implements the interface org.xml.sax.Parser. The parser reads the XML file from start to finish, in effect "walking" the tree of document nodes, and calls the methods of user-defined handler classes as it goes.

To process an XML document, the programmer creates a class that implements the interface org.xml.sax.DocumentHandler. The Parser object (that is, the object that implements org.xml.sax.Parser) reads the XML from its input source and calls the methods of the DocumentHandler as it recognizes tags, character data, and so on in the input.

The methods of the DocumentHandler interface are shown in Listing 9.


public interface DocumentHandler {

  public abstract void setDocumentLocator (Locator locator);

  public abstract void startDocument () throws SAXException;

  public abstract void endDocument () throws SAXException;

  public abstract void startElement (String name, AttributeList atts)
    throws SAXException;

  public abstract void endElement (String name) throws SAXException;

  public abstract void characters (char ch[], int start, int length)
    throws SAXException;

  public abstract void ignorableWhitespace (char ch[], int start, int length)
    throws SAXException;

  public abstract void processingInstruction (String target, String data)
    throws SAXException;
}

Listing 9. interface org.xml.sax.DocumentHandler

Package org.xml.sax includes a utility class called HandlerBase, which implements the interface in Listing 9 (as well as some other interfaces in the SAX package) with methods that do nothing. Programmers can create a subclass of HandlerBase that overrides only the methods they want to use.
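To make the callback sequence concrete, here is a minimal sketch of a HandlerBase subclass that records each event it receives. The names TraceDemo, TraceHandler, and trace() are hypothetical and are not part of the SAX package or the article's sample code. The sketch also assumes a current JDK, whose bundled JAXP factory (which postdates this article) can hand out a SAX 1 org.xml.sax.Parser; with another parser package, creating the parser would look different.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Parser;

// Hypothetical demo class. The SAX 1 types used here are deprecated in
// current JDKs, hence the annotation.
@SuppressWarnings("deprecation")
class TraceDemo {

    // Overrides only the callbacks we care about; HandlerBase supplies
    // do-nothing versions of everything else.
    static class TraceHandler extends HandlerBase {
        final List<String> events = new ArrayList<>();

        @Override public void startDocument() { events.add("startDocument"); }
        @Override public void endDocument()   { events.add("endDocument"); }

        @Override public void startElement(String name, AttributeList atts) {
            events.add("startElement " + name);
        }

        @Override public void endElement(String name) {
            events.add("endElement " + name);
        }

        @Override public void characters(char[] ch, int start, int length) {
            // A parser may deliver one run of text in several calls.
            events.add("characters " + new String(ch, start, length));
        }
    }

    // Parses the given XML text and returns the events in call order.
    static List<String> trace(String xml) {
        try {
            // Assumption: obtain a SAX 1 Parser from the JDK's JAXP factory.
            Parser parser = SAXParserFactory.newInstance().newSAXParser().getParser();
            TraceHandler handler = new TraceHandler();
            parser.setDocumentHandler(handler);
            parser.parse(new InputSource(new StringReader(xml)));
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        for (String event : trace("<note><to>Ann</to></note>")) {
            System.out.println(event);
        }
    }
}
```

Running main() on the small document above should show the start and end callbacks nesting in the same way the tags do.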

For example, say we want a class that counts the elements in an XML document. We could write a class as follows:


import org.xml.sax.*;

public class ElementCounter extends HandlerBase {
    protected int _iElements = 0;

    public ElementCounter() { }

    // Each time the SAX parser encounters an element, it
    // will call this method
    public void startElement (String name, AttributeList atts)
        throws SAXException {
        _iElements++;
    }

    // Called once, after the parser has read the entire document
    public void endDocument() {
        System.out.println("Document contains " + _iElements +
            " elements.");
    }
}

Listing 10. A class that counts the elements in an XML document

To create a Java program that counts elements in an XML file, you'd simply create a SAX parser (how you do that depends on your particular parser package), then create an instance of your ElementCounter class. You then call the parser's setDocumentHandler() method with the new ElementCounter as an argument. The parser keeps a reference to the DocumentHandler you passed to it. When you call the parser's parse() method, the parser reads its input source. Each time it encounters the start of an element (that is, an opening tag) in the XML file, it calls the startElement() method of your ElementCounter object, passing the name of the tag and a list of any attributes the tag has. When the parser reaches the end of the input, it calls endDocument(), which prints the count.
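The wiring described above might look roughly like the following sketch. Because parser creation depends on the parser package, this version assumes a current JDK and asks its bundled JAXP factory (an API that postdates this article) for a SAX 1 Parser. CountDemo, CountingHandler, and countElements() are hypothetical names; the handler follows the same idea as Listing 10 but returns the count through a getter rather than printing it.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Parser;

// Hypothetical driver class, not part of the article's sample package.
// The SAX 1 types are deprecated in current JDKs, hence the annotation.
@SuppressWarnings("deprecation")
class CountDemo {

    // Counts start tags, as Listing 10 does.
    static class CountingHandler extends HandlerBase {
        private int elements = 0;

        @Override
        public void startElement(String name, AttributeList atts) {
            elements++;
        }

        int getElements() {
            return elements;
        }
    }

    static int countElements(String xml) {
        try {
            // Step 1: create a parser (assumption: via the JDK's JAXP factory).
            Parser parser = SAXParserFactory.newInstance().newSAXParser().getParser();

            // Step 2: create the handler and register it with the parser.
            CountingHandler handler = new CountingHandler();
            parser.setDocumentHandler(handler);

            // Step 3: parse; the parser calls back into the handler as it reads.
            parser.parse(new InputSource(new StringReader(xml)));
            return handler.getElements();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        // One <order> element plus two <item> elements.
        System.out.println(countElements("<order><item/><item/></order>"));
    }
}
```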

Experimenting with SAX
An example package, com.javaworld.JavaBeans.XMLApr99, can be downloaded for free (see Resources). The sample main() program lets you specify (in this order):

  1. An XML file to parse
  2. The fully specified class name of the parser (optional)
  3. The fully specified class name of a document handler

The package includes two document handlers: the ElementCounter from Listing 10, and a handler called SimplePrinter, which (naturally) simply prints the XML with an easy-to-read indentation. You can try writing your own document handler and passing it to the main method (called com.javaworld.JavaBeans.XMLApr99.ParseDemo.main()).
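If you want a starting point for a handler of your own, the sketch below shows one way an indenting printer could be written. It is not the source of the package's SimplePrinter; IndentDemo, IndentingPrinter, and prettyPrint() are hypothetical names, and the parser is again obtained from a current JDK's JAXP factory as an assumption. To stay short, it does not escape special characters and prints each chunk of text on its own line.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Parser;

// Hypothetical demo class. The SAX 1 types are deprecated in current
// JDKs, hence the annotation.
@SuppressWarnings("deprecation")
class IndentDemo {

    static class IndentingPrinter extends HandlerBase {
        private final StringBuilder out = new StringBuilder();
        private int depth = 0;   // current nesting level

        private void indent() {
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
        }

        @Override
        public void startElement(String name, AttributeList atts) {
            indent();
            out.append('<').append(name);
            for (int i = 0; i < atts.getLength(); i++) {
                out.append(' ').append(atts.getName(i))
                   .append("=\"").append(atts.getValue(i)).append('"');
            }
            out.append(">\n");
            depth++;
        }

        @Override
        public void endElement(String name) {
            depth--;
            indent();
            out.append("</").append(name).append(">\n");
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Skip whitespace-only text; print anything else on its own line.
            String text = new String(ch, start, length).trim();
            if (!text.isEmpty()) {
                indent();
                out.append(text).append('\n');
            }
        }

        String getOutput() {
            return out.toString();
        }
    }

    static String prettyPrint(String xml) {
        try {
            // Assumption: obtain a SAX 1 Parser from the JDK's JAXP factory.
            Parser parser = SAXParserFactory.newInstance().newSAXParser().getParser();
            IndentingPrinter printer = new IndentingPrinter();
            parser.setDocumentHandler(printer);
            parser.parse(new InputSource(new StringReader(xml)));
            return printer.getOutput();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(prettyPrint("<a x=\"1\"><b>hi</b></a>"));
    }
}
```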

You'll need the JAR file called "XMLApr99.jar," and you'll need to download the JAR file for IBM's excellent "XML for Java" package (version 2). Place both JAR files in your CLASSPATH, and type


java com.javaworld.JavaBeans.XMLApr99.ParseDemo

for instructions. The XML for Java package includes excellent documentation, a programmer's guide, and several example programs to get you started.

The source code is also available in zip and tar.gz formats. As an exercise, try downloading one of the other vendors' XML parsers from the Resources section, and then overriding the method ParseDemo.createParser() in the sample code to create a parser from the new package.


Page 1 XML for the absolute beginner
Page 2 HTML: All form and no substance
Page 3 An XML conceptual example
Page 4 Make up a markup
Page 5 So, what good is made-up markup?
Page 6 Cascading Style Sheets: not just for HTML anymore
Page 7 XSL: I like your style
Page 8 Modeling information structure in XML
Page 9 XML and Java
Page 10 Become a tree surgeon!


Copyright © 2003 JavaWorld.com, an IDG company